MetaCost: A General Method for Making Classifiers Cost-Sensitive
Abstract
Research in machine learning, statistics and related fields has produced a wide variety of algorithms for classification. However, most of these algorithms assume that all errors have the same cost, which is seldom the case in KDD problems. Individually making each classification learner cost-sensitive is laborious, and often non-trivial. In this paper we propose a principled method for making an arbitrary classifier cost-sensitive by wrapping a cost-minimizing procedure around it. This procedure, called MetaCost, treats the underlying classifier as a black box, requiring no knowledge of its functioning or change to it. Unlike stratification, MetaCost is applicable to any number of classes and to arbitrary cost matrices. Empirical trials on a large suite of benchmark databases show that MetaCost almost always produces large cost reductions compared to the cost-blind classifier used (C4.5RULES) and to two forms of stratification. Further tests identify the key components of MetaCost and those that can be varied without substantial loss. Experiments on a larger database indicate that MetaCost scales well.
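The abstract says only that MetaCost wraps a cost-minimizing procedure around a black-box classifier; it does not spell out the steps. The following is a minimal Python sketch of how such a wrapper is commonly described: estimate class probabilities from models trained on bootstrap resamples, relabel each training example with the class of lowest expected cost, then retrain the same learner on the relabeled data. The function names, the `fit(X, y) -> predict` interface, the use of vote fractions as probability estimates, and the toy `fit_majority` learner are all assumptions made for illustration, not details taken from the abstract.

```python
import random
from collections import Counter


def min_expected_cost_class(probs, cost):
    """Return the class i that minimises sum_j probs[j] * cost[i][j].

    Assumed convention: cost[i][j] is the cost of predicting class i
    when the true class is j.
    """
    n = len(probs)
    risks = [sum(probs[j] * cost[i][j] for j in range(n)) for i in range(n)]
    return min(range(n), key=risks.__getitem__)


def metacost_relabel(X, y, fit, cost, n_resamples=10, seed=0):
    """Relabel the training set with cost-minimising classes.

    fit(X, y) is a hypothetical black-box learner that returns a
    function predict(x) -> class index.
    """
    rng = random.Random(seed)
    n_classes = len(cost)
    models = []
    for _ in range(n_resamples):
        # Bootstrap resample: draw len(X) indices with replacement.
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(fit([X[i] for i in idx], [y[i] for i in idx]))
    new_y = []
    for x in X:
        # Use the fraction of models voting for each class as a crude
        # class-probability estimate (an assumption of this sketch).
        votes = Counter(m(x) for m in models)
        probs = [votes[c] / n_resamples for c in range(n_classes)]
        new_y.append(min_expected_cost_class(probs, cost))
    return new_y


def metacost(X, y, fit, cost, **kwargs):
    """Retrain the unchanged learner on the cost-minimising labels."""
    return fit(X, metacost_relabel(X, y, fit, cost, **kwargs))


def fit_majority(X, y):
    """Toy stand-in learner: always predicts the most common label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label
```

For instance, with class probabilities `[0.8, 0.2]` and cost matrix `[[0, 10], [1, 0]]`, predicting class 0 has expected cost 2.0 and predicting class 1 has expected cost 0.8, so `min_expected_cost_class` returns 1 even though class 0 is more probable. Because the learner is only ever called through `fit`, any classifier with that interface could be substituted for `fit_majority`.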
Similar resources
On Class-probability Estimates and Cost-sensitive Evaluation of Classifiers
This paper addresses two cost-sensitive learning methodology issues. First, we ask the question of whether Bagging is always an appropriate procedure to compute accurate class-probability estimates for cost-sensitive classification. Second, we will point the reader to a potential source of erroneous results in the most common procedure of evaluating cost-sensitive classifiers when the real miscla...
Building Ensembles of Classifiers for Loss Minimization
One of the most active areas of research in supervised learning has been the study of methods for constructing good ensembles of classifiers, that is, a set of classifiers whose individual decisions are combined to increase overall accuracy of classifying new examples. In many applications classifiers are required to minimize an asymmetric loss function rather than the raw misclassification rate. ...
An Efficient Predictive Model for Myocardial Infarction Using Cost-sensitive J48 Model
BACKGROUND Myocardial infarction (MI) occurs due to heart muscle death that costs like human life, which is higher than the treatment costs. This study aimed to present an MI prediction model using classification data mining methods, which consider the imbalance nature of the problem. METHODS We enrolled 455 healthy and 295 myocardial infarction cases of visitors to Shahid Madani Specialized ...
Inducing Cost-Sensitive Non-Linear Decision Trees
This paper presents a new decision tree learning algorithm that takes account of costs of misclassification. The algorithm is based on the hypothesis that non-linear decision nodes provide a better basis for cost-sensitive induction than axis-parallel decision nodes and utilizes discriminant analysis to construct non-linear cost-sensitive decision trees. The performance of the algorithm is eval...
Classification cost: An empirical comparison among traditional classifier, Cost-Sensitive Classifier, and MetaCost
Loan fraud is a critical factor in the insolvency of financial institutions, so companies make an effort to reduce the loss from fraud by building a model for proactive fraud prediction. However, there are still two critical problems to be resolved for the fraud detection: (1) the lack of cost sensitivity between type I error and type II error in most prediction models, and (2) highly skewed di...